Genotype imputation via matrix completion.
Abstract
Most current genotype imputation methods are model-based and computationally intensive, taking days to impute one chromosome pair on 1000 people. We describe an efficient genotype imputation method based on matrix completion. Our matrix completion method is implemented in MATLAB and tested on real data from HapMap 3, simulated pedigree data, and simulated low-coverage sequencing data derived from the 1000 Genomes Project. Compared with leading imputation programs, the matrix completion algorithm embodied in our program MENDEL-IMPUTE achieves comparable imputation accuracy while reducing run times significantly. Implementation in a lower-level language such as Fortran or C is apt to further improve computational efficiency.
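To make the idea of imputation by matrix completion concrete, here is a minimal Python sketch. It is not the MENDEL-IMPUTE algorithm, whose details the abstract does not give. It uses a generic technique, iterative soft-thresholding of singular values, on a samples-by-SNPs matrix of genotype dosages coded 0, 1, or 2. The function names, the `shrink` parameter, and the column-mean initialization are all assumptions made for illustration.

```python
import numpy as np

def complete_matrix(X, mask, shrink=1.0, n_iters=200, tol=1e-6):
    """Low-rank fill-in by iterative singular-value soft-thresholding.

    X      : 2-D float array; values where mask is False are ignored
    mask   : boolean array, True where X is observed
    shrink : amount subtracted from each singular value (illustrative knob)
    """
    observed = np.where(mask, X, 0.0)
    # Start missing entries at the mean of the observed values in each column.
    col_means = observed.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    Z = np.ones_like(observed) * col_means
    for _ in range(n_iters):
        # Keep observed entries, use the current estimate elsewhere.
        filled = np.where(mask, observed, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - shrink, 0.0)  # soft-threshold favors low rank
        Z_new = (U * s) @ Vt
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

def impute_genotypes(G, mask, **kwargs):
    """Round the low-rank estimate to genotype calls in {0, 1, 2}."""
    Z = complete_matrix(G, mask, **kwargs)
    calls = np.clip(np.rint(Z), 0, 2)
    observed = np.where(mask, G, 0.0)
    # Observed genotypes are returned unchanged; only gaps are filled.
    return np.where(mask, observed, calls).astype(int)
```

A call such as `impute_genotypes(G, mask, shrink=0.1)` returns an integer matrix of the same shape as `G`. In practice a method of this kind would likely work on windows of nearby SNPs rather than a whole chromosome at once, since linkage disequilibrium makes local blocks approximately low rank; that windowing is omitted here.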
Related articles
Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density panel
Genotype imputation from low-density to high-density (SNP) chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In the current research, the accuracy of genotype imputation from low and moderate-density panels (5K and 50K) to high-density panels in the purebred and crossbred populations was assessed. The simulated popu...
SparRec: An effective matrix completion framework of missing data imputation for GWAS
Genome-wide association studies present computational challenges for missing data imputation, while the advances of genotype technologies are generating datasets of large sample sizes with sample sets genotyped on multiple SNP chips. We present a new framework SparRec (Sparse Recovery) for imputation, with the following properties: (1) The optimization models of SparRec, based on low-rank and l...
The importance of genetic relationship and phenotypic records for the genomic accuracy of simulated imputed data using animal models in the presence of genotype × environment interactions
The objective of this study was to investigate the role of genetic relationships between the training and validation sets, considering different ratios of phenotypic records in the training set, on the accuracy of genomic prediction via animal models containing genotype × environment interactions in simulated imputation data. For this purpose, four different scenarios using 15k density containing differen...
Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method
The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. By using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, and each chromosome was set to a length of 1 Morgan. The number of SNPs per chromosome was 10000. One hundred QTLs were randomly distributed a...
Orthogonal Rank-One Matrix Pursuit for Matrix Completion
Low rank modeling has found applications in a wide range of machine learning and data mining tasks, such as matrix completion, dimensionality reduction, compressed sensing, multi-class and multi-task learning. Recently, significant efforts have been devoted to the low rank matrix completion problem, as it has important applications in many domains including collaborative filtering, Microarray d...
Journal:
- Genome Research
Volume 23, Issue 3
Pages: -
Publication date: 2013